    Gene fusions and gene duplications: relevance to genomic annotation and functional analysis

    BACKGROUND: Escherichia coli, a model organism, provides information for annotation of other genomes. Our analysis of its genome has shown that proteins encoded by fused genes need special attention. Such composite (multimodular) proteins consist of two or more components (modules) encoding distinct functions. Multimodular proteins have been found to complicate both annotation and generation of sequence-similar groups. Previous work overstated the number of multimodular proteins in E. coli. This work corrects the identification of modules by including sequence information from proteins in 50 sequenced microbial genomes. RESULTS: Multimodular E. coli K-12 proteins were identified from sequence similarities between their component modules and non-fused proteins in 50 genomes and from the literature. We found 109 multimodular proteins in E. coli containing either two or three modules. Most modules had standalone sequence relatives in other genomes. The separated modules together with all the single (un-fused) proteins constitute the sum of all unimodular proteins of E. coli. Pairwise sequence relationships among all E. coli unimodular proteins generated 490 sequence-similar paralogous groups. Groups ranged in size from 92 to 2 members and had varying degrees of relatedness among their members. Some E. coli enzyme groups were compared to homologs in other bacterial genomes. CONCLUSION: The deleterious effects of multimodular proteins on annotation and on the formation of groups of paralogs are emphasized. To improve annotation results, all multimodular proteins in an organism should be detected and, when known, each function should be connected with its location in the sequence of the protein. When transferring functions by sequence similarity, alignment locations must be noted, particularly when alignments cover only part of the sequences, in order to enable transfer of the correct function.
Separating multimodular proteins into module units makes it possible to generate protein groups related by both sequence and function, avoiding mixing of unrelated sequences. Organisms differ in sizes of groups of sequence-related proteins. A sample comparison of orthologs to selected E. coli paralogous groups correlates with known physiological and taxonomic relationships between the organisms.

    A Genome-Wide Analysis of Promoter-Mediated Phenotypic Noise in Escherichia coli

    Gene expression is subject to random perturbations that lead to fluctuations in the rate of protein production. As a consequence, for any given protein, genetically identical organisms living in a constant environment will contain different amounts of that particular protein, resulting in different phenotypes. This phenomenon is known as “phenotypic noise.” In bacterial systems, previous studies have shown that, for specific genes, both transcriptional and translational processes affect phenotypic noise. Here, we focus on how the promoter regions of genes affect noise and ask whether levels of promoter-mediated noise are correlated with genes' functional attributes, using data for over 60% of all promoters in Escherichia coli. We find that essential genes and genes with a high degree of evolutionary conservation have promoters that confer low levels of noise. We also find that the level of noise cannot be attributed to the evolutionary time that different genes have spent in the genome of E. coli. In contrast to previous results in eukaryotes, we find no association between promoter-mediated noise and gene expression plasticity. These results are consistent with the hypothesis that, in bacteria, natural selection can act to reduce gene expression noise and that some of this noise is controlled through the sequence of the promoter region alone.

    Reduced Selective Constraint in Endosymbionts: Elevation in Radical Amino Acid Replacements Occurs Genome-Wide

    As predicted by the nearly neutral model of evolution, numerous studies have shown that reduced Ne accelerates the accumulation of slightly deleterious changes under genetic drift. While such studies have mostly focused on eukaryotes, bacteria also offer excellent models to explore the effects of Ne. Most notably, the genomes of host-dependent bacteria with small Ne show signatures of genetic drift, including elevated Ka/Ks. Here, I explore the utility of an alternative measure of selective constraint: the per-site rate of radical and conservative amino acid substitutions (Dr/Dc). I test the hypothesis that purifying selection against radical amino acid changes is less effective in two insect endosymbiont groups (Blochmannia of ants and Buchnera of aphids), compared to related gamma-Proteobacteria. Genome comparisons demonstrate a significant elevation in Dr/Dc in endosymbionts that affects the majority (66–79%) of shared orthologs examined. The elevation of Dr/Dc in endosymbionts affects all functional categories examined. Simulations indicate that Dr/Dc estimates are sensitive to codon frequencies and mutational parameters; however, estimation biases occur in the opposite direction as the patterns observed in genome comparisons, thereby making the inference of elevated Dr/Dc more conservative. Increased Dr/Dc and other signatures of genome degradation in endosymbionts are consistent with strong effects of genetic drift in their small populations, as well as linkage to selected sites in these asexual bacteria. While relaxed selection against radical substitutions may contribute, genome-wide processes such as genetic drift and linkage best explain the pervasive elevation in Dr/Dc across diverse functional categories that include basic cellular processes. 
Although the current study focuses on a few bacterial lineages, it suggests Dr/Dc is a useful gauge of selective constraint and may provide a valuable alternative to Ka/Ks when high sequence divergences preclude estimates of Ks. Broader application of Dr/Dc will benefit from approaches less prone to estimation biases.

    Physiological Roles of ArcA, Crp, and EtrA and Their Interactive Control on Aerobic and Anaerobic Respiration in Shewanella oneidensis

    In the genome of Shewanella oneidensis, genes encoding the global regulators ArcA, Crp, and EtrA have been identified. All these proteins deviate from their counterparts in E. coli significantly in terms of functionality and regulon. It is worth investigating the involvement and relationship of these global regulators in aerobic and anaerobic respiration in S. oneidensis. In this study, the impact of the transcriptional factors ArcA, Crp, and EtrA on aerobic and anaerobic respiration in S. oneidensis was assessed. While all these proteins appeared to be functional in vivo, the importance of individual proteins in these two major biological processes differed. The ArcA transcriptional factor was critical in aerobic respiration while the Crp protein was indispensable in anaerobic respiration. Using a newly developed reporter system, it was found that expression of arcA and etrA was not influenced by growth conditions but transcription of crp was induced by removal of oxygen. An analysis of the impact of each protein on transcription of the others revealed that Crp expression was independent of the other factors whereas ArcA repressed both etrA and its own transcription while EtrA also repressed arcA transcription. Transcriptional levels of arcA in the wild type, crp, and etrA strains under either aerobic or anaerobic conditions were further validated by quantitative immunoblotting with a polyclonal antibody against ArcA. This extensive survey demonstrated that all three of these global regulators are functional in S. oneidensis. In addition, the reporter system constructed in this study will facilitate in vivo transcriptional analysis of targeted promoters.

    Fluid Ontologies in the Search for MH370

    This paper gives an account of the disappearance of Malaysia Airlines Flight MH370 into the southern Indian Ocean in March 2014 and analyses the rare glimpses into remote ocean space this incident opened up. It follows the tenuous clues as to where the aeroplane might have come to rest after it disappeared from radar screens – seven satellite pings, hundreds of pieces of floating debris and six underwater sonic recordings – as ways of entering into and thinking about ocean space. The paper pays attention to and analyses this space on three registers – first, as a fluid, more-than-human materiality with particular properties and agencies; second, as a synthetic situation, a composite of informational bits and pieces scopically articulated and augmented; and third, as geopolitics, delineated by the protocols of international search and rescue. On all three registers – as matter, as data and as law – the ocean is shown to be ontologically fluid, a world defined by movement, flow and flux, posing intractable difficulties for human interactions with it.

    Catalytic and Non-Catalytic Roles for the Mono-ADP-Ribosyltransferase Arr in the Mycobacterial DNA Damage Response

    Recent evidence indicates that the mycobacterial response to DNA double-strand breaks (DSBs) differs substantially from that of previously characterized bacteria. These differences include the use of three DSB repair pathways (HR, NHEJ, SSA), and the CarD pathway, which integrates DNA damage with transcription. Here we identify a role for the mono-ADP-ribosyltransferase Arr in the mycobacterial DNA damage response. Arr is transcriptionally induced following DNA damage and cellular stress. Although Arr is not required for induction of a core set of DNA repair genes, Arr is necessary for suppression of a set of ribosomal protein genes and rRNA during DNA damage, placing Arr in a similar pathway as CarD. Surprisingly, the catalytic activity of Arr is not required for this function, as catalytically inactive Arr was still able to suppress ribosomal protein and rRNA expression during DNA damage. In contrast, Arr substrate binding and catalytic activities were required for regulation of a small subset of other DNA damage-responsive genes, indicating that Arr has both catalytic and noncatalytic roles in the DNA damage response. Our findings establish an endogenous cellular function for a mono-ADP-ribosyltransferase apart from its role in mediating Rifampin resistance.

    Tuning fresh: radiation through rewiring of central metabolism in streamlined bacteria

    Most free-living planktonic cells are streamlined and in spite of their limitations in functional flexibility, their vast populations have radiated into a wide range of aquatic habitats. Here we compared the metabolic potential of subgroups in the Alphaproteobacteria lineage SAR11 adapted to marine and freshwater habitats. Our results suggest that the successful leap from marine to freshwaters in SAR11 was accompanied by a loss of several carbon degradation pathways and a rewiring of the central metabolism. For example, the C1 and methylated compound degradation pathways, the Entner–Doudoroff pathway, the glyoxylate shunt and anaplerotic carbon fixation are absent from the freshwater genomes. Evolutionary reconstructions further suggest that the metabolic modules making up these important freshwater metabolic traits were already present in the gene pool of ancestral marine SAR11 populations. The loss of the glyoxylate shunt had already occurred in the common ancestor of the freshwater subgroup and its closest marine relatives, suggesting that the adaptation to freshwater was a gradual process. Furthermore, our results indicate rapid evolution of TRAP transporters in the freshwater clade involved in the uptake of low molecular weight carboxylic acids. We propose that such gradual tuning of metabolic pathways and transporters toward locally available organic substrates is linked to the formation of subgroups within the SAR11 clade and that this process was critical for the freshwater clade to find and fix an adaptive phenotype. This work was supported by the Swedish Research Council (Grant Numbers 2012-4592 to AE and 2012-3892 to SB) and the Community Sequencing Programme of the US Department of Energy Joint Genome Institute. The work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported under Contract No. DE-AC02-05CH11231.

    An Empirical Strategy for Characterizing Bacterial Proteomes across Species in the Absence of Genomic Sequences

    Global protein identification through current proteomics methods typically depends on the availability of sequenced genomes. In spite of increasingly high-throughput sequencing technologies, this information is not available for every microorganism and rarely available for entire microbial communities. Nevertheless, the protein-level homology that exists between related bacteria makes it possible to extract biological information from the proteome of an organism or microbial community by using the genomic sequences of a near-neighbor organism. Here, we demonstrate a trans-organism search strategy for determining the extent to which near-neighbor genome sequences can be applied to identify proteins in unsequenced environmental isolates. In proof-of-concept testing, we found that within a CLUSTAL W distance of 0.089, near-neighbor genomes successfully identified a high percentage of proteins within an organism. Application of this strategy to characterize environmental bacterial isolates lacking sequenced genomes, but having 16S rDNA sequence similarity to Shewanella, resulted in the identification of 300–500 proteins in each strain. The majority of identified pathways mapped to core processes, as well as to processes unique to the Shewanellae, in particular to the presence of c-type cytochromes. Examples of core functional categories include energy metabolism, protein and nucleotide synthesis and cofactor biosynthesis, allowing classification of bacteria by observation of conserved processes. Additionally, within these core functionalities, we observed proteins involved in the alternative lactate utilization pathway, recently described in Shewanella.

    Stringent response of Escherichia coli: revisiting the bibliome using literature mining

    Understanding the mechanisms responsible for cellular responses depends on the systematic collection and analysis of information on the main biological concepts involved. Indeed, the identification of biologically relevant concepts in free text, namely genes, tRNAs, mRNAs, gene products and small molecules, is crucial to capture the structure and functioning of different responses. Results: In this work, we review literature reports on the study of the stringent response in Escherichia coli. Rather than undertaking the development of a highly specialised literature mining approach, we investigate the suitability of concept recognition and statistical analysis of concept occurrence as means to highlight the concepts that are most likely to be biologically engaged during this response. The co-occurrence analysis of core concepts in this stringent response, i.e. the (p)ppGpp nucleotides, with gene products was also inspected and suggests that besides the enzymes RelA and SpoT that control the basal levels of (p)ppGpp nucleotides, many other proteins have a key role in this response. Functional enrichment analysis revealed that basic cellular processes such as metabolism, transcriptional and translational regulation are central, but other stress-associated responses might be elicited during the stringent response. In addition, the identification of less annotated concepts revealed that some (p)ppGpp-induced functional activities are still overlooked in most reviews. Conclusions: In this paper, we applied a literature mining approach that offers a more comprehensive analysis of the stringent response in E. coli. The compilation of relevant biological entities to this stress response and the assessment of their functional roles provided a more systematic understanding of this cellular response. Overlooked regulatory entities, such as transcriptional regulators, were found to play a role in this stress response.
Moreover, the involvement of other stress-associated concepts demonstrates the complexity of this cellular response.